
A model of correlated random networks is examined, i.e. networks with correlations between the degrees of neighboring nodes. These nodes do not necessarily have to be direct neighbors; the maximum range of the correlations can be chosen arbitrarily. Two different methods for the creation of such networks are presented: one of them is a generalization of a well-known algorithm by Maslov and Sneppen. The percolation threshold for the model is calculated and the result is tested using analytically solvable examples and simulations. Finally, the principal importance of correlations and clustering for the topology of networks is discussed. Using a straightforward extension of the network model by Barabasi and Albert, it is shown how a clustering coefficient independent of the network size can originate in growing networks.
I. INTRODUCTION



Scale-free networks, i.e. networks with essentially power-law degree distributions, have recently been widely studied (see [?] and [?] for reviews). Such degree distributions have been found in many different contexts, for example in several technological webs like the Internet, the WWW [?], or electrical power grids, and also in social networks, like the network of sexual contacts [?] or that of phone calls.

The standard model reproducing scale-free degree distributions was introduced by Barabasi and Albert (BA-model) [?]. It is based on a growth algorithm with preferential attachment. A second, older model which is also widely studied in the context of scale-free networks is the configuration model treated by Molloy and Reed (MR-model) [?]. This is to some extent the 'most random' model possessing a given degree distribution $P(k)$ and a given number $N$ of nodes. The building prescription starts with sets of $N P(k)$ nodes with $k$ stubs each. The stubs are then connected randomly to each other; two connected stubs form a link. Double bonds and self-connections can be neglected in the limit of large networks, $N \to \infty$. MR-networks exhibit an arbitrary degree distribution but no correlations, i.e. the degree distribution of nodes on one end of a link is independent of that on the other end of the link. For the BA-model there are non-trivial correlations between neighboring nodes. For both networks the correlations cannot be influenced by the construction algorithm. In contrast, real networks exhibit a wide range of different correlations. For example, some are assortative while others are dissortative [?] (i.e. nodes are generally attached to nodes with similar degree or not). It has been noted for some time that the failure of the standard models to include this wide range of correlations is a considerable defect. Hence, a lot of effort has been put into the solution of this problem (e.g. [?]).

This publication aims to contribute to this effort. We introduce a general model for correlations, the most natural generalization of the MR-model. To our knowledge this model has not yet been proposed in the full generality in which we present it here. On the other hand, it has been examined in considerable detail for correlations between neighboring nodes. The generalization from correlations between neighboring nodes to those of arbitrary range might be considered only a small advance. However, since there is no full knowledge about the actual correlations in natural networks, correlations of long range should not be neglected a priori. As simple examples show, sometimes only correlations of long range have a decisive influence on important topological properties like the percolation threshold.

Thus far, an analytical expression for the percolation threshold has been derived only for a small class of random networks. In [?, ?] MR-networks are treated. In the present manuscript, we derive a general result for random networks with arbitrary correlations, i.e. correlations between direct neighbors but also between second neighbors etc. Employing the set of Eqs. (7) and (8), the percolation threshold can be calculated, at least in principle. Related work can be found in [?]. There, Newman studies the influence of mixing on the size of the giant component by mimicking percolation with the change of a characteristic degree scale. In [17] Vazquez and Moreno calculate the percolation threshold for random networks with correlations between direct neighbors. In a completely different way, we will also derive this threshold in Sec. III. In contrast to our method, the approach of Vazquez and Moreno also yields the size of the giant component. Finally, in [18] Boguna and Serrano present a general theory for percolation in directed random networks with general two-point correlations and bidirectional links.
FIG. 1: An exemplary network.



II. THE MODEL 



We distinguish different classes of networks by the range $l-1$ of the correlations: $P(k_{l+1}|k_l;\ldots;k_1)$ depends on $k_2$, but not on $k_1$. The distribution $P(k_{l+1}|k_l;\ldots;k_1)$ denotes the probability that a randomly chosen path which has the node sequence with degrees $k_1, k_2, \ldots, k_l$ is continued by a node with degree $k_{l+1}$. The normalization condition is $\sum_{k_x} P(k_x|k_{x-1};\ldots;k_1) = 1$. By a randomly chosen path we understand: 'The first link of the path is chosen uniformly at random. The following link is chosen at random under the condition that it is adjacent to the first, etc.' Consequently, the probability that a randomly chosen path of length $l-1$ has the node sequence $k_1, k_2, \ldots, k_l$ is

$$T(k_l;\ldots;k_1) \propto P(k_1)\,k_1 \left[\prod_{s=2}^{l-1} P(k_s|k_{s-1};\ldots;k_1)\,(k_s-1)\right] P(k_l|k_{l-1};\ldots;k_1). \tag{1}$$

The factor $k_1 P(k_1)$ is proportional to the probability of finding the node degree $k_1$ at the beginning, i.e. the first position of the path. $P(k_s|k_{s-1};\ldots;k_1)(k_s-1)$ then accounts for finding a node with degree $k_s$ at position $s < l$ in the path. The last term $P(k_l|k_{l-1};\ldots;k_1)$ accounts for the last position in the path. The constant of proportionality is fixed by the normalization $\sum_{k_l,\ldots,k_1} T(k_l;\ldots;k_1) = 1$; for paths of length one ($l = 2$) it is $1/\langle k\rangle$. Of course, in all networks we need to have

$$T(k_l;k_{l-1};\ldots;k_2;k_1) = T(k_1;k_2;\ldots;k_{l-1};k_l), \quad \forall\, l \in \mathbb{N}. \tag{2}$$

Assuming a maximum range $l-1$ of the correlations in the network, only the paths of length $\le l-1$ need to be considered.

An example. In Fig. 1 we see an exemplary network, in which we can measure the probabilities $T$. We can then determine recursively the corresponding $P$ using Eq. (1). We measure:

P(1) | P(2) | P(3) | T(2;1) | T(3;1) | T(1;2) | T(3;2) | T(1;3) | T(2;3)
3/5  | 1/5  | 1/5  | 1/8    | 2/8    | 1/8    | 1/8    | 2/8    | 1/8

Notice that we counted both directions separately for each type of path. Solving Eq. (1) for $P(k_1|k_2)$ then yields:

P(2|1) | P(3|1) | P(1|2) | P(3|2) | P(1|3) | P(2|3)
1/3    | 2/3    | 1/2    | 1/2    | 2/3    | 1/3

In the same way we can proceed to measure elements $T(k_3; k_2; k_1)$ etc.
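The measurement described above can be sketched in a few lines of Python. Fig. 1 itself is not reproduced here, so the edge list below is an assumption: a five-node tree (hypothetical node labels `a` to `e`) chosen only because it reproduces the measured values in the two tables. The inversion uses Eq. (1) for paths of length one, with the normalization $1/\langle k\rangle$.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical edge list (not taken from Fig. 1): a small tree that
# reproduces the measured values of P(k) and T(k2; k1) quoted in the text.
edges = [("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]

degree = Counter(node for edge in edges for node in edge)
n_nodes = len(degree)
mean_degree = Fraction(2 * len(edges), n_nodes)

# Degree distribution P(k)
P = {k: Fraction(c, n_nodes) for k, c in Counter(degree.values()).items()}

# T(k2; k1): every link is counted in both directions
# (k1 = degree at the start of the path, k2 = degree at its end).
path_counts = Counter()
for u, v in edges:
    path_counts[(degree[v], degree[u])] += 1
    path_counts[(degree[u], degree[v])] += 1
T = {key: Fraction(c, 2 * len(edges)) for key, c in path_counts.items()}

# Invert T(k2; k1) = P(k1) k1 P(k2|k1) / <k> for the conditional probability.
P_cond = {(k2, k1): t * mean_degree / (P[k1] * k1) for (k2, k1), t in T.items()}

for (k2, k1), value in sorted(P_cond.items()):
    print(f"P({k2}|{k1}) = {value}")
```

Exact fractions are used so that the values can be compared directly with the tables; for a large measured network one would presumably switch to floating-point counts.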

Generalized Maslov-Sneppen. In [?] Maslov and Sneppen introduce a local rewiring algorithm that randomizes correlated networks while strictly preserving the degrees of the nodes. It yields networks with the same topology as MR-networks. We propose a generalization of this MS-algorithm that randomizes networks while preserving not only the degree distribution but also correlations up to a certain range. This algorithm should be especially useful for examining which correlations are important and which can be neglected in real networks. For example, it will help to answer a question of the following type: up to which range must correlations be considered in order to determine the percolation threshold of a certain network within a given error? Fig. 2 presents the algorithm for the case in which the network is randomized with respect to all correlations of range longer than one, i.e. only the correlations between next neighbors are kept:

• Choose randomly one motif link-node (with node degree $k$) in the network.

• Choose another motif that has the same topology link-node (with the same degree $k$).

• Exchange the parts attached to the end of the link.

A motif consists of a small connected neighborhood and possibly dangling edges attached to some nodes of the neighborhood. We see that in the process the distribution of paths of length one stays the same. However, the distributions of longer paths are randomized.
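The three steps above can be sketched as follows for the range-one case. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the rejection of swaps that would create self-loops or double bonds is an added rule that the text does not specify. A motif is an ordered link (a, b) whose end node b has degree k; two motifs with equal end-node degree exchange the nodes at the other end of the link.

```python
import random
from collections import Counter


def joint_degree_counts(edges):
    """Multiset of (degree, degree) pairs over all links."""
    deg = Counter(n for e in edges for n in e)
    return Counter(tuple(sorted((deg[u], deg[v]))) for u, v in edges)


def rewire_preserving_pair_correlations(edges, n_attempts, seed=0):
    """Randomize a simple graph while keeping the distribution of paths of length one."""
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edge_list}
    deg = Counter(n for e in edge_list for n in e)  # degrees never change

    for _ in range(n_attempts):
        i = rng.randrange(len(edge_list))
        j = rng.randrange(len(edge_list))
        if i == j:
            continue
        a, b = edge_list[i]
        c, d = edge_list[j]
        # Pick a random orientation for each link; b and d are the motif nodes.
        if rng.random() < 0.5:
            a, b = b, a
        if rng.random() < 0.5:
            c, d = d, c
        if deg[b] != deg[d]:
            continue  # the two motifs must have the same node degree k
        if len({a, b, c, d}) < 4:
            continue  # assumed rule: avoid self-loops and trivial swaps
        if frozenset((c, b)) in edge_set or frozenset((a, d)) in edge_set:
            continue  # assumed rule: avoid double bonds
        # Exchange the parts attached to the far end of the link.
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {frozenset((c, b)), frozenset((a, d))}
        edge_list[i] = (c, b)
        edge_list[j] = (a, d)
    return edge_list


example_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0),
                 (0, 3), (1, 6), (6, 7), (2, 8)]
rewired = rewire_preserving_pair_correlations(example_edges, 500, seed=1)
print(joint_degree_counts(rewired) == joint_degree_counts(example_edges))
```

Because each accepted swap replaces links with degree pairs $(k_a, k)$ and $(k_c, k)$ by links with the pairs $(k_c, k)$ and $(k_a, k)$, both the degrees and the link-level degree correlations are unchanged, which is the property the comparison in the last line checks.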






FIG. 2: Generalized MS-algorithm: randomization conserving correlations of range up to one. 







FIG. 3: Generalized MS-algorithm: randomization conserving correlations of range up to two. 

This can easily be generalized to a randomization in which correlations of range up to $x$ are preserved. In the prescription above, simply replace the motif link-node by a tree link-node-...-link-node of range $x$ (i.e. each branch has $x$ links and nodes). The rest of the process stays the same. For example, when the motif consists of a tree with range two, the distribution of paths of length two stays the same. With respect to correlations of longer range the network is again randomized (cp. Fig. 3). Finally, the known MS-algorithm is obtained when the motif is only a link. Then only the degree distribution of the network and none of the correlations are preserved.

Creating networks with next-neighbor correlations. It is fairly easy to create a network with correlations only between next neighbors. Such a network is determined by the degree distribution and all correlations $P(k|l)$ with $k > l$. All other $P(k|l)$ are fixed either by the normalization condition $\sum_k P(k|l) = 1$ or by Eq. (2), which amounts to $P(l)\,l\,P(k|l) = P(k)\,k\,P(l|k)$. Now, the network can be created in the following way:

• We start with $N$ nodes whose degrees are distributed according to $P(k)$.

• We randomly connect the stubs of all nodes with degree 1 with stubs of other nodes according to $P(k|1)$.

• We link all nodes with degree 2 according to $P(k|2)$ (with $k > 2$ of course) etc.
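Before the stubs are connected, the full set of $P(k|l)$ has to be completed from the free entries with $k > l$. The sketch below does this with the symmetry relation $P(l)\,l\,P(k|l) = P(k)\,k\,P(l|k)$ and the normalization condition. The function name and the dictionary layout are assumptions made for illustration; the numerical input is the parameter set quoted later for the simulations, with $P(3|2) = 0.1$.

```python
from fractions import Fraction


def complete_pair_correlations(P, upper):
    """Complete P(k|l) from the free entries with k > l.

    P:     dict degree -> probability
    upper: dict (k, l) -> P(k|l) for all pairs with k > l
    Returns a dict (k, l) -> P(k|l) for all pairs of degrees.
    """
    degrees = sorted(P)
    full = dict(upper)
    for l in degrees:
        for k in degrees:
            if k < l:
                # Symmetry of link counts: P(l) l P(k|l) = P(k) k P(l|k)
                full[(k, l)] = P[k] * k * upper[(l, k)] / (P[l] * l)
        # The normalization sum_k P(k|l) = 1 fixes the diagonal entry.
        full[(l, l)] = 1 - sum(full[(k, l)] for k in degrees if k != l)
        if full[(l, l)] < 0:
            raise ValueError(f"inconsistent input: P({l}|{l}) would be negative")
    return full


third = Fraction(1, 3)
P = {1: third, 2: third, 3: third}
upper = {(2, 1): Fraction(3, 10), (3, 1): Fraction(1, 5), (3, 2): Fraction(1, 10)}
full = complete_pair_correlations(P, upper)
for l in sorted(P):
    print(l, [str(full[(k, l)]) for k in sorted(P)])
```

The number of links between degree classes $k$ and $l$ that the construction has to realize is then proportional to $N P(l)\,l\,P(k|l)$, which is symmetric in $k$ and $l$ by construction.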

With this procedure it is much more difficult to construct networks with correlations of longer range. We briefly illustrate the difficulties arising from the generalization by examining the case of correlations of range two. In addition, we allow only nodes with degrees up to three in the network. In Fig. 4 we show how such a network is constructed. Step (1) in the figure, where starting from node $k$ we build a chain using $P(3|k)$ and $P(l|3;k)$ (cf. Eq. (1)), is unproblematic. For step (2), however, we have to respect that the degree $m$ is determined by both of the node degrees $k$ and $l$. So, we have to introduce a new correlation function to determine $m$: $P(m|3;k,l)$.

FIG. 4: Constructing a network with correlations of range two. Step (1) builds the chain using $P[3|k]$; step (2) determines the open degree $m$ using $P[m|3;l,k]$.

Similar to Eq. (2), again several consistency equations need to be fulfilled:

$$\begin{aligned}
& 2N\,P(k)\,k\,P(3|k)\,P(l|3;k)\,P(m|3;l,k) \\
={}& 2N\,P(k)\,k\,P(3|k)\,P(m|3;k)\,P(l|3;m,k) \\
={}& 2N\,P(m)\,m\,P(3|m)\,P(k|3;m)\,P(l|3;k,m) \\
={}& 2N\,P(m)\,m\,P(3|m)\,P(l|3;m)\,P(k|3;l,m) \\
={}& 2N\,P(l)\,l\,P(3|l)\,P(m|3;l)\,P(k|3;m,l) \\
={}& 2N\,P(l)\,l\,P(3|l)\,P(k|3;l)\,P(m|3;k,l).
\end{aligned} \tag{3}$$

Each term counts the number of trihedrals in the network with a node of degree 3 as center and $k$, $l$, $m$ as the degrees of the other nodes. Of course this number must not depend on the order in which we develop the nodes of the trihedral. The probability distribution for the trihedral motifs is denoted $T(3;k,l,m)$. Eq. (3) then corresponds to the fact that all permutations of $k$, $l$, $m$ yield the same $T(3;k,l,m)$. We get the following set of relations:

$$P(k|3;l,m) = P(k|3;m,l),$$
$$P(l|3;k)\,P(m|3;l,k) = P(m|3;k)\,P(l|3;m,k).$$

Due to the symmetry of the problem, it is not surprising that we need only one of the correlation functions $P(k|3;l,m)$. Now we have all necessary conditional probabilities and can construct a network with correlations of range two and maximum degree three:

• Construct all chains $T(k;2;l)$ according to Eq. (1). Divide the number of all resulting motifs by $2 \times 1$ (the number of times the same motif is developed by the correlation functions).

• Construct all trihedrals $T(3;m,l,k)$. Starting with Eq. (1) we find all $T(k;3;l)$, then we continue with $T(3;m,l,k) = P(m|3;l,k)\,T(l;3;k)$. Divide the resulting number of all motifs by $3 \times 2 \times 1$.

• Combine all motifs randomly, e.g. a trihedral $T(3;2,l_2,k_2)$ and a chain $T(k_1;2;3)$, or two chains $T(k_1;2;2)$ and $T(2;2;k_2)$, or two trihedrals $T(3;m_1,3,k_1)$ and $T(3;3,l_2,k_2)$.

A generalization of this concept is straightforward, but becomes complex quite fast.

III. PERCOLATION CONDITION 
A. Theoretical derivation 

In [?] Cohen et al. introduce as a percolation condition for the MR-model:

    A graph has a spanning cluster, when a site j, which is reached by following a link from site i on the giant cluster, has at least one other link on average.

To be applicable to our correlated random networks we generalize this condition:

    A correlated graph is characterized by the distribution of motifs, which fully determines the topology of the graph. A giant component exists, if motifs which are linked to the giant cluster have at least one other link (leaving the motif) on average.

In our model with correlations of range $l-1$ the network is characterized by the distribution of trees with branches of length $l-1$ (Fig. 2: node-link-node, and Fig. 3: node-link-node-link-node). These trees are the motifs of our generalized random model with correlations of range up to $l-1$. For the calculation of the percolation threshold $p_c$ we do not consider the whole trees as motifs but confine ourselves to chains of length $l-1$ (i.e. $l$ nodes and $l-1$ links, cp. Fig. 5b). All solutions obtained for $p_c$ are also solutions for the whole trees as motifs instead of the chains. We define the map

$$f\big[T_{GC,p_c}(k_{l-1},k'_{l-1};\ldots;k_1,k'_1;k_0,k'_0)\big] := \sum_{k_0,k'_0} (k'_{l-1}-1)\binom{k_l}{k'_l}(1-p_c)^{k'_l}\,p_c^{\,k_l-k'_l}\; T_{GC,p_c}(k_{l-1},k'_{l-1};\ldots;k_1,k'_1;k_0,k'_0)\,P(k_l|k_{l-1};\ldots;k_1). \tag{4}$$
FIG. 5: (a) Graphical representation of the map Eq. (4): The dotted circle moves away exactly one step from an arbitrarily chosen root of the giant cluster. The dotted circle on the left side circumscribes a sufficiently large connected part of the giant component. Now, only at the percolation transition, $f$ reproduces on the right side exactly the same topology, including the same number of motifs and the same distribution. (b) An exemplary motif, a chain with degree sequence $k_0, k_1, \ldots, k_{l-1}$. A motif belongs to the dotted circle in (a) if its first node is in it.



Here, $T_{GC}$ is the normalized probability distribution of chains in the giant component (cp. Fig. 5b) with degree sequence $k_0, \ldots, k_{l-1}$ in the original network and degree sequence $k'_0, \ldots, k'_{l-1}$ in the percolated network. $P(k_l|k_{l-1};\ldots;k_1)$ determines the degree $k_l$ of node $l$ that is next in the chain. Again, $k_l$ is the original degree, and $\binom{k_l}{k'_l}(1-p_c)^{k'_l}\,p_c^{\,k_l-k'_l}$ determines the probability that the percolated degree is $k'_l$. Finally, the factor $k'_{l-1}-1$ accounts for the number of chains developed due to the number of outgoing links of node $l-1$. In Fig. 5a we see a graphical representation of this map, Eq. (4). Due to the percolation condition introduced at the beginning of this section, at the threshold $p_c$ the following relations must be fulfilled:

1. On the right side there must be the same distribution of motifs in the dotted circle as on the left side.

2. On the right side there must be the same number of motifs in the dotted circle as on the left side.

These conditions translate into the following set of equations:

$$f\big[T_{GC,p_c}(k_{l-1},k'_{l-1};\ldots;k_1,k'_1;k_0,k'_0)\big] = T_{GC,p_c}(k_l,k'_l;\ldots;k_2,k'_2;k_1,k'_1), \tag{5}$$

$$\sum_{k_0,k'_0,\ldots,k_{l-1},k'_{l-1}} (k'_{l-1}-1)\,T_{GC,p_c}(k_{l-1},k'_{l-1};\ldots;k_0,k'_0) = 1. \tag{6}$$

Notice that in the first equation the distribution on the right side is only normalized if Eq. (6) is fulfilled. Eq. (6) states that the number of motifs to which a motif connects is one on average. A short derivation yields the following set of equations:

$$\sum_{k_1} (1-p_c)(k_l-1)\,T_{GC,p_c}(k_{l-1};\ldots;k_2;k_1)\,P(k_l|k_{l-1};\ldots;k_1) = T_{GC,p_c}(k_l;\ldots;k_3;k_2), \tag{7}$$

$$\sum_{k_1,\ldots,k_{l-1}} T_{GC,p_c}(k_{l-1};\ldots;k_2;k_1) = 1. \tag{8}$$

Notice that the $T$ here are different from the $T$ in Eqs. (5) and (6). Here, we interpret the $T$ only as auxiliary variables (their definition can be seen by comparing Eqs. (5) and (7)). Eqs. (7) and (8) are a system of $K^{l-1}+1$ equations, where $K$ is the number of different degrees in the considered network and $l-1$ is the range of the correlations. There are $K^{l-1}+1$ variables: $p_c$ and $T_{GC,p_c}(k_{l-1};\ldots;k_1)$, where $k_1, \ldots, k_{l-1}$ take on the values of all $K$ degrees. Thus, the set of equations (7) and (8) allows us to determine $p_c$.
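For orientation, the case of pair correlations ($l = 2$) can be written out explicitly. This is only a specialization of Eqs. (7) and (8), reading the degree-dependent prefactor in Eq. (7) as $(k_l - 1)$; the matrix $M$ and the eigenvalue $\lambda$ are shorthand introduced here and do not appear in the original:

$$(1-p_c)\sum_{k_1} (k_2-1)\,P(k_2|k_1)\,T_{GC,p_c}(k_1) = T_{GC,p_c}(k_2), \qquad \sum_{k_1} T_{GC,p_c}(k_1) = 1.$$

With $M_{k_2 k_1} := (k_2-1)\,P(k_2|k_1)$ this is the eigenvalue problem $M\,T = \lambda\,T$ with $\lambda = 1/(1-p_c)$. Since $T$ has to be a probability distribution, one would expect the relevant solution to belong to the largest eigenvalue of the non-negative matrix $M$, so that $p_c = 1 - 1/\lambda_{\max}$.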

Node or link removal. Considering these generalized random networks in the thermodynamic limit, the percolation threshold is the same for node and edge percolation. Edge percolation removes the fraction $p$ of all links from the network. Node percolation removes the fraction $p$ of all nodes from the network, i.e. the network size is changed from $N$ nodes to $(1-p)N$ nodes, and from the remaining $(1-p)N$ nodes those links (a fraction $p$) are removed which connect to removed nodes. We can thus formulate the relation between the percolated degree distribution under link removal $P_{lr}(k',k)$ and that under node removal $P_{nr}(k',k)$:

$$P_{nr}(k',k) = (1-p)\,P_{lr}(k',k) + p\,\delta_{k',0}\,P(k). \tag{9}$$

Here, $k$ is the original degree and $k'$ the percolated one. The probability that the link between node $x-1$ and node $x$ is removed in edge percolation is $p$. In node percolation we have the same probability $p$ that node $x$ is removed, and consequently also the link connecting node $x-1$ with $x$. All conditional probabilities are the same, e.g.

$$P_{nr}(k_x|k_{x-1};\ldots;k_1) = P_{lr}(k_x|k_{x-1};\ldots;k_1). \tag{10}$$

These are all original degrees. We see that the distribution of motifs is the same at every $p$ for both networks with different percolation procedures. Only the network size differs by a factor $1-p$ due to Eq. (9). Hence, both procedures yield the same percolation threshold.



B. Examples 



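Exact ones. We have tested Eqs. (7) and (8) with different networks using Mathematica:

1. $P(k) = \tfrac{1}{3}\delta_{k,1} + \tfrac{1}{3}\delta_{k,2} + \tfrac{1}{3}\delta_{k,3}$ and
$P(k|l) = \delta_{l,1}\delta_{k,3} + \delta_{l,2}\left(\tfrac{1}{2}\delta_{k,2} + \tfrac{1}{2}\delta_{k,3}\right) + \delta_{l,3}\left(\tfrac{1}{3}\delta_{k,1} + \tfrac{1}{3}\delta_{k,2} + \tfrac{1}{3}\delta_{k,3}\right)$

2. $P(k) = \tfrac{1}{3}\delta_{k,1} + \tfrac{1}{3}\delta_{k,2} + \tfrac{1}{3}\delta_{k,3}$ and
$P(k|l) = \delta_{l,1}\delta_{k,2} + \delta_{l,2}\left(\tfrac{1}{2}\delta_{k,1} + \tfrac{1}{2}\delta_{k,3}\right) + \delta_{l,3}\left(\tfrac{1}{3}\delta_{k,2} + \tfrac{2}{3}\delta_{k,3}\right)$ and
$P(k|l;m) = \delta_{m,1}\delta_{l,2}\delta_{k,3} + \delta_{m,2}\delta_{l,3}\left(\tfrac{1}{3}\delta_{k,2} + \tfrac{2}{3}\delta_{k,3}\right) + \delta_{m,3}\delta_{l,2}\delta_{k,1} + \delta_{m,3}\delta_{l,3}\left(\tfrac{1}{3}\delta_{k,2} + \tfrac{2}{3}\delta_{k,3}\right)$

We find for the networks:

1. $p_c = 1/7$

2. $p_c = 1/4$

To check if our algorithm yields the correct results, we use the correspondence of these networks to MR-networks:

1. $P(k) = \tfrac{1}{2}\left(\tfrac{2}{3}\right)^3\delta_{k,3} + \left[\tfrac{3}{2}\,\tfrac{1}{3}\left(\tfrac{2}{3}\right)^2 + \tfrac{1}{2}\right]\delta_{k,2} + \tfrac{3}{2}\,\tfrac{2}{3}\left(\tfrac{1}{3}\right)^2\delta_{k,1} + \tfrac{1}{2}\left(\tfrac{1}{3}\right)^3\delta_{k,0}$

2. $P(k) = \left(\tfrac{2}{3}\right)^3\delta_{k,3} + \left[3\,\tfrac{1}{3}\left(\tfrac{2}{3}\right)^2\right]\delta_{k,2} + 3\,\tfrac{2}{3}\left(\tfrac{1}{3}\right)^2\delta_{k,1} + \left(\tfrac{1}{3}\right)^3\delta_{k,0}$

The first example exhibits no correlations except that nodes with degree $k = 1$ connect only to those with degree $k = 3$. This means that with probability $\tfrac{1}{3}\,[P(1)/P(3)] = \tfrac{1}{3}$ a stub of a node with degree 3 is blocked. We can replace each node with $s$ blocked stubs by a node with degree $k = 3 - s$. In this 'new' network there are no correlations at all. For this MR-network the percolation threshold is calculated according to [?, ?]:

$$1 - p_c = \frac{1}{\dfrac{\langle k_0^2\rangle}{\langle k_0\rangle} - 1}. \tag{11}$$

The result $p_c = 1/7$ agrees. In the second example stubs are again blocked, but this time by chains consisting of one node with degree $k = 2$ and one with degree $k = 1$. Again, there are no other correlations. We find the correct $p_c = 1/4$.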
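The two MR checks can be reproduced with exact arithmetic. The sketch below evaluates Eq. (11) for the two reduced degree distributions, assuming $P(1) = P(2) = P(3) = 1/3$ in the original networks, so that each stub of a degree-3 node is blocked with probability 1/3. In the first example the reduced network is taken to consist of equal numbers of unchanged degree-2 nodes and thinned degree-3 nodes. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def mr_threshold(dist):
    """Percolation threshold of an uncorrelated network, Eq. (11).

    dist maps degree -> probability.
    """
    mean_k = sum(k * p for k, p in dist.items())
    mean_k2 = sum(k * k * p for k, p in dist.items())
    return 1 - 1 / (mean_k2 / mean_k - 1)


def thinned_degree3(block):
    """Degree distribution of a degree-3 node whose stubs are blocked independently."""
    return {3 - s: comb(3, s) * block**s * (1 - block)**(3 - s) for s in range(4)}


thinned = thinned_degree3(Fraction(1, 3))

# Example 2: all remaining nodes are thinned degree-3 nodes.
example2 = dict(thinned)

# Example 1: half of the remaining nodes are thinned degree-3 nodes,
# the other half are untouched degree-2 nodes.
example1 = {k: p / 2 for k, p in thinned.items()}
example1[2] += Fraction(1, 2)

print(mr_threshold(example1), mr_threshold(example2))  # prints: 1/7 1/4
```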

MR-model. Our method yields the threshold of the MR-network correctly. There, the trivial correlations are

$$P(k_l|k_{l-1};\ldots;k_1) = \frac{k_l\,P(k_l)}{\langle k\rangle}. \tag{12}$$

Employing this in Eq. (7) and summing over $k_2, \ldots, k_l$, we get with Eq. (8)

$$(1-p_c)\sum_{k_l} (k_l-1)\,\frac{k_l\,P(k_l)}{\langle k\rangle} = 1. \tag{13}$$

This is identical with Eq. (11), the correct percolation threshold for uncorrelated random networks.
FIG. 6: Percolation in a random network with pair correlations (x-axis: percentage of nodes removed). The network size was 100,000 nodes. For all networks the degree distribution was $P(1) = P(2) = P(3) = 1/3$ and the correlations were $P(2|1) = 0.3$, $P(3|1) = 0.2$. The different curves represent different correlations $P(3|2)$, which takes on the values 0.1, 0.2, and 0.3. All points are averaged over six runs. The crosses on the x-axis are the theoretical values for the percolation thresholds calculated according to Eqs. (7) and (8).




FIG. 7: An example of an algorithm that decreases the power-law exponent $\gamma$ of the degree distribution by one without increasing the percolation threshold: All nodes $i$ in the network with degree $k_i$ are replaced by neighborhoods of $k_i$ nodes with degree $k_i$. For each node, $k_i - 1$ of its links connect to the other $k_i - 1$ nodes in the neighborhood and one link connects to the outside of the neighborhood.

Simulations. We also tested the algorithm numerically using correlations that cannot be mapped onto an uncorrelated network (see Fig. 6). We measured the size of the giant component in percolated networks with different correlation profiles. The size of the networks we created was 100,000 nodes. The degree distribution of the original network at $p = 0$ was always the same, $P(1) = P(2) = P(3) = 1/3$, as were the neighbor correlations $P(2|1) = 0.3$ and $P(3|1) = 0.2$. For $P(3|2)$ we employed the values 0.1, 0.2, and 0.3. As noted before, all other correlations are determined by these. We calculated for the percolation transition the values 0.427, 0.395, and 0.368, respectively. With the completely different method of Vazquez and Moreno [17] the same results were obtained. In Fig. 6 these values correspond quite well with the simulations. Of course, it is impossible to determine $p_c$ exactly from the simulations.
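The theoretical values can be recomputed with a short script. The sketch assumes the pair-correlation form of Eqs. (7) and (8), i.e. $p_c = 1 - 1/\lambda_{\max}$, where $\lambda_{\max}$ is the largest eigenvalue of the matrix with entries $(k-1)P(k|l)$; this reading of the equations, the function names, and the use of a plain power iteration are choices made here for illustration. The remaining $P(k|l)$ are completed from $P(l)\,l\,P(k|l) = P(k)\,k\,P(l|k)$ and the normalization.

```python
def simulation_correlations(p32):
    """Full P(k|l) for P(1)=P(2)=P(3)=1/3, P(2|1)=0.3, P(3|1)=0.2 and a given P(3|2)."""
    P = {1: 1 / 3, 2: 1 / 3, 3: 1 / 3}
    upper = {(2, 1): 0.3, (3, 1): 0.2, (3, 2): p32}
    cond = dict(upper)
    for l in P:
        for k in P:
            if k < l:
                cond[(k, l)] = P[k] * k * upper[(l, k)] / (P[l] * l)
        cond[(l, l)] = 1 - sum(cond[(k, l)] for k in P if k != l)
    return P, cond


def threshold(P, cond, iterations=500):
    """p_c = 1 - 1/lambda_max for the matrix M[k][l] = (k - 1) P(k|l)."""
    degrees = sorted(P)
    vec = {k: 1 / len(degrees) for k in degrees}  # normalized start vector
    lam = 0.0
    for _ in range(iterations):
        new = {k: (k - 1) * sum(cond[(k, l)] * vec[l] for l in degrees)
               for k in degrees}
        lam = sum(new.values())  # eigenvalue estimate, since vec sums to one
        vec = {k: v / lam for k, v in new.items()}
    return 1 - 1 / lam


for p32 in (0.1, 0.2, 0.3):
    P, cond = simulation_correlations(p32)
    print(f"P(3|2) = {p32}: p_c = {threshold(P, cond):.3f}")
```

Under these assumptions the script yields a threshold that decreases with growing $P(3|2)$: the most assortative choice, $P(3|2) = 0.1$, gives the largest $p_c$.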

Note. We have seen in the exact examples 1. and 2. above that correlations play a role for the percolation properties of networks, since both networks have the same degree distribution, but different correlations and different percolation thresholds. For the uncorrelated network with the same degree distribution we get according to Eq. (11): $p_c = 1/4$. Example 1 yields a smaller $p_c = 1/7$ due to the dissortativity of the network: nodes with degree $k = 1$ only link to nodes with degree $k = 3$. An example for assortative pair correlations would be $P(k|l) = \delta_{l,3}\delta_{k,3} + \delta_{l,2}\delta_{k,2} + \delta_{l,1}\delta_{k,1}$. There are two giant components in such a network and thus two different thresholds, $p_{c,1} = 0$ and $p_{c,2} = 1/2$. Thus, there exists a giant component for all $p < 1/2$, which is then by definition the value of the percolation threshold for the whole network. That means that assortativity yields a higher $p_c$ in comparison to the uncorrelated network.

IV. OUTLOOK: LOOPS AND CLUSTERING 

In this section, we show that the percolation properties depend not only on the degree distribution and the degree correlations, but also on the length of typical loops in the network. Note that loops cannot be sufficiently described by degree correlations (for our generalized correlated random model, the appearance of certain loops depends crucially on the network size). This problem is also studied in the context of embedding a network into a geography [21, 22, 23].

We consider the following example: an MR-network with a power-law degree distribution $P_i(k)$ with an exponent $\gamma > 3$. This network has a percolation threshold $p_c < 1$ (cf. Eq. (11)). Replacing links by motifs link-node-link, we integrate new nodes into the network in such a manner that the resulting degree distribution $P_f(k)$ follows a power law with an exponent $\gamma < 3$. We consider these two possibilities:

1. We integrate the new nodes in such a way that the resulting network corresponds to an MR-model.

2. We assign a certain number $N_i$ of the new nodes to each existing node $i$ with degree $k_i$. With those $N_i$ nodes we build a neighborhood around $i$ that has exactly $k_i$ outgoing links (cf. for example Fig. 7).

This yields the following percolation thresholds:

1. $p_c = 1$, because the second moment of the degree distribution $P_f(k)$ diverges.

2. $p_c < 1$, because the resulting network with degree distribution $P_f(k)$ has a superstructure: a network with degree distribution $P_i(k)$ where the nodes are finite-size neighborhoods. This superstructure determines a bound on the percolation threshold.

Note that it might well be possible to ensure that both networks also have the same degree correlations. So, apparently it plays an important role for the value of the percolation threshold whether additional links are only added within a neighborhood or connect distant parts of the network. Consequently, it is an interesting question how many links of the Internet or of social networks connect only nodes that already belong to the same neighborhood. We stress that, contrary to the BA- and MR-models, real networks seldom exhibit local tree structure, indicating redundant linking in neighborhoods. This points to common interests or properties of nodes. For example, in social networks these may be geographic location, language, age or level of education.

A. Extension of the BA-model 

It is an obvious shortcoming of the BA-model that it does not implement clustering. The clustering coefficient (number of triangles divided by the number of pairs of adjacent edges, i.e. edges with at least one identical endpoint) tends to zero in the thermodynamic limit. We suggest here a simple extension of the BA-model that allows one to influence clustering while preserving the degree distribution of the BA-model.

We assume that every new node added to the network brings with it $m = 2$ proper links. These proper links connect the new node with nodes in the network according to different criteria. For example, in a friendship network, every individual would have the right to choose two friends. The first friend is chosen from people who do the same job. The second friend is chosen from people who have the same favorite hobby. Both times, people who already have a lot of friends are preferred (i.e. preferential attachment).

The new feature compared with the BA-model is that, at its introduction, we assign to each node $i$ two parameters, a job-parameter $0 < p_{j,i} < 1$ and a hobby-parameter $0 < p_{h,i} < 1$. Each new node has a job-link and a hobby-link. Now, according to preferential attachment we first determine the degree $k_1$ of node 1, to which the job-link shall attach. Then, among the nodes with this degree, we search for the node 1 that has the $p_{j,1}$ closest to $p_{j,i}$, corresponding to the best matching of common interests. The same procedure determines node 2, to which the hobby-link of $i$ attaches.

Qualitatively, the clustering depends on the correlations between the parameters $p_{j,i}$ and $p_{h,i}$. We choose $p_{j,i}$ uniformly at random; $p_{h,i}$ is chosen depending on the value of $p_{j,i}$. There are two limiting cases:

1. The choice of $p_{h,i}$ is independent of the choice of $p_{j,i}$. Then our model corresponds exactly to the BA-model with $m = 2$ and exhibits a vanishing clustering coefficient.

2. $p_{h,i} = p_{j,i} = p_i$. Then the clustering is maximal.

For the second case, when the degrees $k_1$ and $k_2$ of nodes 1 and 2 are equal, a double bond is formed. If as an additional rule we prohibit double bonds, the second link shall be connected to a node 2 with the parameter closest to $p_i$ but different from node 1. In this model the clustering is maximal, because the probability that nodes 1 and 2 are neighbors is maximal. In the case that 1 and 2 are neighbors, a new triangle is formed in the network.
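The growth rule and its two limiting cases can be explored with a small simulation. The following is a minimal sketch under several assumptions that the text leaves open: the seed graph is a triangle, double bonds are prohibited, a degree is redrawn if no admissible node with that degree exists, and all function names are hypothetical. The clustering coefficient follows the definition used in this section (triangles divided by pairs of adjacent edges).

```python
import random


def grow_network(n_nodes, correlated=True, seed=0):
    """Grow a network in which every new node brings a job-link and a hobby-link."""
    rng = random.Random(seed)
    neighbors = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}  # assumed seed graph: a triangle
    job, hobby = {}, {}
    for i in range(3):
        job[i] = rng.random()
        hobby[i] = job[i] if correlated else rng.random()
    stubs = [0, 0, 1, 1, 2, 2]  # node i appears degree(i) times

    for new in range(3, n_nodes):
        job[new] = rng.random()
        # Limiting case 2 (correlated) or limiting case 1 (independent).
        hobby[new] = job[new] if correlated else rng.random()
        targets = []
        for params in (job, hobby):
            candidates = []
            while not candidates:
                # Preferential attachment selects the degree of the target ...
                k = len(neighbors[rng.choice(stubs)])
                candidates = [v for v in neighbors
                              if len(neighbors[v]) == k and v not in targets]
            # ... and the parameter selects the best match among nodes of that degree.
            targets.append(min(candidates, key=lambda v: abs(params[v] - params[new])))
        neighbors[new] = set(targets)
        for t in targets:
            neighbors[t].add(new)
            stubs += [t, new]
    return neighbors


def clustering(neighbors):
    """Number of triangles divided by the number of pairs of adjacent edges."""
    closed = 0  # connected neighbor pairs; every triangle is counted three times
    pairs = 0
    for nb in neighbors.values():
        nb = sorted(nb)
        pairs += len(nb) * (len(nb) - 1) // 2
        closed += sum(1 for i, a in enumerate(nb) for b in nb[i + 1:]
                      if b in neighbors[a])
    return (closed / 3) / pairs


for n in (200, 400):
    c_corr = clustering(grow_network(n, correlated=True))
    c_ind = clustering(grow_network(n, correlated=False))
    print(n, round(c_corr, 4), round(c_ind, 4))
```

Running the loop for increasing `n` gives a rough impression of how the clustering coefficient behaves with network size in the two limiting cases; the candidate search is linear in the network size, so the sketch is only suitable for small networks.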

Another important feature of the second limiting case is that the clustering coefficient is independent of the network size. This can be seen in the following way: The probability that a node $x$ of degree $k_x$ has a neighbor of degree $k_y$ depends only on the degrees of the nodes and not on the network size. If node $x$ has a neighbor $y$ of degree $k_y$, then the distribution for the difference in parameters $P(|p_x - p_y|/N) = P(|p_{h,x} - p_{h,y}|/N) = P(|p_{j,x} - p_{j,y}|/N)$ is independent of the network size. This guarantees that the probability that the two nodes 1 and 2 (in the growth algorithm presented above) are connected is independent of the network size. Thus, the probability that a new triangle is added to the network together with a new node is independent of the network size. From the fact that with every node three pairs of adjacent edges are added to the network and, with a constant probability, a triangle, it follows that the clustering coefficient is independent of $N$ (at least for large $N$).

A generalization of this model is straightforward. Let $P(k_x, k_y, |p_{j,x} - p_{h,y}|/N)$ be the probability that two nodes $x$ and $y$ with degrees $k_x$ and $k_y$ and with difference in parameters $|p_{j,x} - p_{h,y}|$ are neighbors. If the parameter $p_{h,i}$ for the new nodes is chosen such that $P(k_x, k_y, |p_{j,x} - p_{h,y}|/N)$ stays constant with growing network size $N$, then the clustering coefficient is approximately independent of $N$. This should be the case for a scale-invariant probability distribution for the choice of $p_{h,i}$ for node $i$: $P(p_{h,i}) = f(|p_{h,i} - p_{j,i}|/N)$ with some arbitrary function $f$. Let us finally note that this is a plausible condition for friendship networks, but also for many other networks. The pool of possible friends from which a new person can choose is always independent of the network size. When somebody comes to a new city to make friends, it does not matter for this choice how many people are living on the whole earth. Friends will be chosen from the immediate neighborhood (mathematically these neighborhoods are characterized by similarity in the parameters $p$). The only difference is that with growing network size, i.e. population on the earth, the parameter interval is renormalized. The parameters of people in one city then lie closer together; the differences are simply scaled. Empirically, however, the network size $N$ does not influence a person's choice of friends.

The parameters $p_j$ and $p_h$ can of course be identified with two-dimensional coordinates in a geography. A generalization to more than two parameters is straightforward. In summary, the clustering in our example network is larger when people with the same job tend to have the same hobbies. This very general outline was meant to show the origin of network-size-invariant clustering and the appearance of neighborhoods (with loops) in real networks.



V. CONCLUSION 

We discussed a general model for correlated random networks in which correlations can have arbitrary range. We presented two algorithms to produce or examine those networks. One, a generalization of a randomization algorithm by Maslov and Sneppen, should be helpful in examining the influence of correlations in real networks. In particular, it can be used to address the question of which range of correlations is important. We derived a set of equations, (7) and (8), determining the percolation threshold for this model and verified the result by different methods. Finally, we added a few general remarks concerning clustering, a measure of the number of triangles in a network, and more generally concerning the existence of loops. For both correlations and loops, their influence on topological properties like the percolation threshold of a network is not yet fully understood. We briefly suggested a mechanism for the emergence of network-size-invariant clustering in real networks.